{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# LlamaParse - Fast checking Insurance Contract for Coverage\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_parse/blob/main/examples/demo_insurance.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "In this notebook we will look at how LlamaParse can be used to extract structured coverage information from an insurance policy."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation of required packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index llama-parse"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download an insurance policy fron IRDAI\n",
    "\n",
    "The Insurance Regulatory and Development Authority of India (IRDAI) maintains a great resource: https://policyholder.gov.in/web/guest/non-life-insurance-products where all insurance policies available in India are publicly available for download! Let's download a complex health insurance policy as an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget \"https://policyholder.gov.in/documents/37343/931203/NBHTGBP22011V012223.pdf/c392bcc1-f6a8-cadd-ab84-495b3273d2c3?version=1.0&t=1669350459879&download=true\" -O \"./policy.pdf\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initializing LlamaIndex and LlamaParse"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# llama-parse is async-first, running the sync code in a notebook requires the use of nest_asyncio\n",
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-...\"\n",
    "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.openai import OpenAI\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.core import VectorStoreIndex\n",
    "from llama_index.core import Settings\n",
    "\n",
    "# for the purpose of this example, we will use the small model embedding and gpt3.5\n",
    "embed_model = OpenAIEmbedding(model=\"text-embedding-3-small\")\n",
    "llm = OpenAI(model=\"gpt-3.5-turbo-0125\")\n",
    "\n",
    "Settings.llm = llm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Vanilla Approach - Parse the Policy with LlamaParse into Markdown"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Started parsing the file under job_id b8946573-c911-4e00-8921-1bad1cda3d64\n",
      "......"
     ]
    }
   ],
   "source": [
    "from llama_parse import LlamaParse\n",
    "\n",
    "documents = LlamaParse(result_type=\"markdown\").load_data(\"./policy.pdf\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "## Preamble\n",
      "\n",
      "This ‘Travel Infinity’ Policy is a contract of insurance between You and Us which is subject to payment of full premium in advance and the terms, conditions and exclusions of this Policy. Expense incurred outside the policy period will NOT be covered. Unutilized Sum Insured will expire at the end of the policy year. All applicable benefits, details and limits are mentioned in your Certificate of insurance. We will cover only allopathic treatments in this policy.\n",
      "\n",
      "## Defined Terms\n",
      "\n",
      "The terms listed below in this Section and used elsewhere in the Policy in Initial Capitals shall have the meaning set out against them in this Section.\n",
      "\n",
      "### Standard Definitions\n",
      "\n",
      "|2.1|Accident or Accidental|means sudden, unforeseen and involuntary event caused by external, visible and violent means.|\n",
      "|---|---|---|\n",
      "|2.2|Co-payment|means a cost sharing requirement under a health insurance policy that provides that the policyholder/insured will bear a specified percentage of the admissible claims a\n"
     ]
    }
   ],
   "source": [
    "print(documents[0].text[0:1000])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Markdown Element Node Parser\n",
    "Our markdown element node parser works well for parsing the markdown output of LlamaParse into a set of table and text nodes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.node_parser import MarkdownElementNodeParser\n",
    "\n",
    "node_parser = MarkdownElementNodeParser(\n",
    "    llm=OpenAI(model=\"gpt-3.5-turbo-0125\"), num_workers=8\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nodes = node_parser.get_nodes_from_documents(documents)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_nodes, objects = node_parser.get_nodes_and_objects(nodes)\n",
    "\n",
    "recursive_index = VectorStoreIndex(nodes=base_nodes + objects)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = recursive_index.as_query_engine(similarity_top_k=25)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Querying the model for coverage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You are covered for the expenses incurred on any alternate travel booking under any mode of transport, up to the limit of the Sum Insured as mentioned in the Certificate of insurance, if the delay of the airlines was caused due to specific reasons outlined in the policy. The amount you are covered for will depend on the specific terms and conditions of your policy, including the maximum coverage limit specified in the Certificate of insurance.\n"
     ]
    }
   ],
   "source": [
    "query_1 = \"My trip was delay and I paid 45, how much am I cover for?\"\n",
    "\n",
    "response_1 = query_engine.query(query_1)\n",
    "print(str(response_1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The information is split across the document which leads to retrieval issues. Let's try some parsing instructions to improve our result."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Started parsing the file under job_id ec9e77c9-6ad9-4c9b-9efb-c9f659b0d481\n",
      "....."
     ]
    }
   ],
   "source": [
    "documents_with_instruction = LlamaParse(\n",
    "    result_type=\"markdown\",\n",
    "    parsing_instruction=\"\"\"\n",
    "This document is an insurance policy.\n",
    "When a benefits/coverage/exlusion is describe in the document ammend to it add a text in the follwing benefits string format (where coverage could be an exclusion).\n",
    "\n",
    "For {nameofrisk} and in this condition {whenDoesThecoverageApply} the coverage is {coverageDescription}. \n",
    "                                        \n",
    "If the document contain a benefits TABLE that describe coverage amounts, do not ouput it as a table, but instead as a list of benefits string.\n",
    "                                       \n",
    "\"\"\",\n",
    ").load_data(\"./policy.pdf\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let see how the 2 parsing compare (change target page to explore)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "## Inpatient treatment\n",
      "\n",
      "Claim Form (filled and signed by pe Insured)\n",
      "Hospital Daily Cash\n",
      "Release of Medical information Form (filled and signed by pe Insured)\n",
      "Waiver of Deductible\n",
      "Original papological and diagnostic reports, discharge summary indoor case papers (if any) and prescriptions issued by pe treating Medical practitioner or Network Provider\n",
      "Optional Co-payment\n",
      "Adventure Sports Cover\n",
      "Home to Home Cover\n",
      "Passport and Visa copy wip Entry Stamp of Country of Visit and exit Stamp from India\n",
      "Extension to in-patient care\n",
      "Ambulance Charge\n",
      "FIR report of police (if applicable)\n",
      "\n",
      "## Out-patient treatment\n",
      "\n",
      "Cancer Screening & Mammographic Examination\n",
      "Original bills and receipts for:\n",
      "1. Charges paid towards Hospital accommodation, nursing facilities, and oper medical services rendered\n",
      "2. Fees paid to pe Medical Practitioner and for special nursing charges\n",
      "3. Charges incurred towards any and all test and / or examinations rendered in connection wip pe treatment\n",
      "4. Charges incurred towards medicines or drugs purchased from a registered pharmacy oper pan pe Network provider duly supported by pe prescriptions of pe Medical Practitioner attending to pe Insured Person\n",
      "5. Any oper document as required by pe Company to assist pe Claim\n",
      "\n",
      "## Medical evacuation\n",
      "\n",
      "Medical reports and transportation details issued by the evacuation agency, prescriptions and medical report by the attending Medical Practitioner furnishing the name of the Insured Person and details of treatment rendered along with the statement confirming the necessity of evacuation.\n",
      "\n",
      "Documentary proof for expenses incurred towards the Medical Evacuation.\n",
      "\n",
      "## Compassionate visit\n",
      "\n",
      "A certificate from the Medical Practitioner recommending the presence in the form of special assistance to be rendered by an additional member during the entire period of hospitalization. The certificate shall also specify the minimum period in which person is admitted in the hospital.\n",
      "\n",
      "Discharge summary of the Hospital furnishing details including the date of admission and date of discharge.\n",
      "\n",
      "Stamped boarding pass with invoice used for the travel by the Immediate Family Member.\n",
      "\n",
      "Copy passport of Immediate Family Member with entry and exit stamp.\n",
      "\n",
      "## Escort of Minor Child\n",
      "\n",
      "A certificate from the Medical Practitioner specifying the minimum period of Hospitalization.\n",
      "\n",
      "Discharge summary of the Hospital furnishing details including the date of admission and date of discharge.\n",
      "\n",
      "Stamped Boarding pass used for the return travel of the child to the Country of Residence.\n",
      "\n",
      "Stamped Boarding pass of the attendant from the Country of Residence to the place of hospitalization (if attendant is necessary).\n",
      "\n",
      "Copy of passport of the child with entry and exit stamp.\n",
      "\n",
      "## Upgradation to Business Class\n",
      "\n",
      "A certificate from the Medical Practitioner specifying the minimum period of Hospitalization.\n",
      "\n",
      "Discharge summary of the Hospital furnishing the details including the date of admission and date of discharge.\n",
      "\n",
      "Product Name: Travel infinity | Product UIN: NBHTGBP22011V012223\n",
      "\n",
      "\n",
      "=========================================================\n",
      "\n",
      "\n",
      "# Insurance Policy\n",
      "\n",
      "## Benefits:\n",
      "\n",
      "- For Inpatient treatment and in this condition when admitted to a hospital, the coverage is reimbursement for medical expenses incurred.\n",
      "- For Hospital Daily Cash and in this condition when hospitalized, the coverage is daily cash benefit.\n",
      "- For Waiver of Deductible and in this condition when a deductible is applicable, the coverage is waiver of the deductible amount.\n",
      "- For Optional Co-payment and in this condition when a co-payment is required, the coverage is optional co-payment.\n",
      "- For Adventure Sports Cover and in this condition when participating in adventure sports, the coverage is coverage for injuries related to adventure sports.\n",
      "- For Home to Home Cover and in this condition when requiring medical evacuation, the coverage is assistance for repatriation to home country.\n",
      "- For Extension to in-patient care and in this condition when extended hospital stay is necessary, the coverage is extension of coverage for in-patient care.\n",
      "- For Ambulance Charge and in this condition when ambulance services are utilized, the coverage is reimbursement for ambulance charges.\n",
      "- For Out-patient treatment and in this condition when receiving outpatient medical care, the coverage is reimbursement for outpatient medical expenses.\n",
      "- For Cancer Screening & Mammographic Examination and in this condition when undergoing cancer screening or mammographic examination, the coverage is coverage for these preventive services.\n",
      "- For New Born baby Cover and in this condition when a newborn is covered under the policy, the coverage is medical expenses coverage for the newborn.\n",
      "- For Maternity and in this condition when maternity services are required, the coverage is coverage for maternity expenses.\n",
      "- For Complete pre-existing disease cover and in this condition when seeking treatment for pre-existing conditions, the coverage is coverage for pre-existing conditions.\n",
      "- For Medical sum insured replenishment in case of hospitalization due to accident and in this condition when hospitalized due to an accident, the coverage is replenishment of the sum insured.\n",
      "- For Waiver of sublimit for insured above 60 years of age and in this condition when the insured is above 60 years of age, the coverage is waiver of sublimits.\n",
      "- For Psychiatric Counseling and in this condition when seeking psychiatric counseling, the coverage is coverage for psychiatric counseling services.\n",
      "- For Physiotherapy and in this condition when undergoing physiotherapy, the coverage is coverage for physiotherapy sessions.\n",
      "- For Terrorism cover and in this condition when affected by terrorism, the coverage is coverage for medical expenses related to terrorism incidents.\n",
      "- For Medical tele-consultation and in this condition when consulting a medical practitioner remotely, the coverage is coverage for tele-consultation services.\n",
      "- For Medical evacuation and in this condition when requiring medical evacuation, the coverage is coverage for medical evacuation services.\n",
      "- For Compassionate visit and in this condition when requiring a compassionate visit, the coverage is coverage for travel expenses for a family member to visit.\n",
      "- For Escort of Minor Child and in this condition when escorting a minor child for medical treatment, the coverage is coverage for escort services for the child.\n",
      "- For Upgradation to Business Class and in this condition when requiring upgradation to business class for medical travel, the coverage is coverage for upgradation to business class.\n"
     ]
    }
   ],
   "source": [
    "target_page = 45\n",
    "pages_vanilla = documents[0].text.split(\"\\n---\\n\")\n",
    "pages_with_instructions = documents_with_instruction[0].text.split(\"\\n---\\n\")\n",
    "\n",
    "print(pages_vanilla[target_page])\n",
    "print(\"\\n\\n=========================================================\\n\\n\")\n",
    "print(pages_with_instructions[target_page])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "node_parser_instruction = MarkdownElementNodeParser(\n",
    "    llm=OpenAI(model=\"gpt-3.5-turbo-0125\"), num_workers=8\n",
    ")\n",
    "nodes_instruction = node_parser.get_nodes_from_documents(documents_with_instruction)\n",
    "(\n",
    "    base_nodes_instruction,\n",
    "    objects_instruction,\n",
    ") = node_parser_instruction.get_nodes_and_objects(nodes_instruction)\n",
    "\n",
    "recursive_index_instruction = VectorStoreIndex(\n",
    "    nodes=base_nodes_instruction + objects_instruction\n",
    ")\n",
    "query_engine_instruction = recursive_index_instruction.as_query_engine(\n",
    "    similarity_top_k=25\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparing Instruction-Augmented Parsing vs. Vanilla Parsing\n",
    "\n",
    "When we parse the document with natural language instructions to add context on insurance coverage, we are able to correctly answer a wide range of queries in our RAG pipeline. In contrast, a RAG pipeline built with the vanilla method is not able to answer these queries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Vanilla:\n",
      "You are covered for the amount you paid due to the trip delay, up to the limit specified in the certificate of insurance.\n",
      "With instructions:\n",
      "For Trip Delay coverage, you are covered for a fixed benefit amount as mentioned in the certificate of insurance for every block of hours of delay.\n"
     ]
    }
   ],
   "source": [
    "query_1 = \"My trip was delayed and I paid 45, how much am I covered for?\"\n",
    "\n",
    "response_1 = query_engine.query(query_1)\n",
    "print(\"Vanilla:\")\n",
    "print(response_1)\n",
    "\n",
    "print(\"With instructions:\")\n",
    "response_1_i = query_engine_instruction.query(query_1)\n",
    "print(response_1_i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the policy it says in list I that one expense not covered is Baby food"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Vanilla:\n",
      "Baby food is not explicitly mentioned in the provided context information regarding insurance coverages and benefits.\n",
      "With instructions:\n",
      "Baby food is excluded from coverage according to the policy terms.\n"
     ]
    }
   ],
   "source": [
    "query_2 = \"I just had a baby, is baby food covered?\"\n",
    "\n",
    "response_2 = query_engine.query(query_2)\n",
    "print(\"Vanilla:\")\n",
    "print(response_2)\n",
    "\n",
    "print(\"With instructions:\")\n",
    "response_2_i = query_engine_instruction.query(query_2)\n",
    "print(response_2_i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Vanilla:\n",
      "Gauze used in your operation would typically be covered under the \"Emergency In-patient Medical Treatment\" or \"Emergency In-patient Medical Treatment with OPD\" benefits of the policy.\n",
      "With instructions:\n",
      "Gauze is not covered for use in your operation as it falls under the category of items that are excluded from coverage in the insurance policy.\n"
     ]
    }
   ],
   "source": [
    "query_3 = \"How is gauze used in my operation covered?\"\n",
    "\n",
    "response_3 = query_engine.query(query_3)\n",
    "print(\"Vanilla:\")\n",
    "print(response_3)\n",
    "\n",
    "print(\"With instructions:\")\n",
    "response_3_i = query_engine_instruction.query(query_3)\n",
    "print(response_3_i)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_parse",
   "language": "python",
   "name": "llama_parse"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
